🎯 CosmosDB Partitioning Strategies

📋 Overview

Partition key selection is the most critical design decision in Azure Cosmos DB. It directly impacts performance, scalability, cost, and query efficiency. Unlike traditional databases where you can modify partition schemes after deployment, Cosmos DB partition keys are immutable - you cannot change them without recreating the container and migrating data.

This article provides comprehensive guidance on choosing the right partitioning strategy for different scenarios, with practical examples and performance considerations.

📚 Table of Contents

📋 Overview
🔍 Partitioning Fundamentals
- Logical vs Physical Partitions
- Partition Key Properties
- Distribution Mechanics
🎯 Core Partitioning Strategies
- Entity ID-Based Partitioning
- Time-Based Partitioning
- Category/Type-Based Partitioning
- Hybrid Approaches
- Synthetic Key Strategies
📊 Strategy Comparison Matrix
🎮 Scenario-Based Recommendations
- High-Volume Applications
- Multi-Tenant Systems
- Time-Series Data
- Document Management
- E-commerce Platforms
- IoT Applications
⚠️ Anti-Patterns and Pitfalls
🔧 Implementation Guidelines
📈 Performance Optimization
🔍 Monitoring and Diagnostics
🚀 Advanced Partitioning Techniques
- Hot/Warm Architecture with TTL
- Single Collection vs Multiple Collections
- Near Real-Time Data Migration
- Collection Lifecycle Management
📝 APPENDIX: Partitioning for Example Feed Database

🔍 Partitioning Fundamentals

Logical vs Physical Partitions

// Logical Partition: All items with the same partition key value
public class BlogPost
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // All posts with same value = 1 logical partition
    
    public string Title { get; set; }
    public string Content { get; set; }
    public DateTime PublishedDate { get; set; }
}

// Example logical partitions:
// Partition "user123" → Contains all blog posts by user123
// Partition "user456" → Contains all blog posts by user456
// Each logical partition can grow up to 20GB

Key Characteristics

Aspect	Logical Partition	Physical Partition
Definition	Items with same partition key	Physical storage unit
Size Limit	20GB maximum	Managed by Cosmos DB
Throughput	10,000 RU/s maximum	Shared across logical partitions
Distribution	Fixed by partition key	Dynamic, managed by service
Query Scope	Single partition queries are efficient	Cross-partition queries are expensive

Partition Key Properties

A good partition key should have:

🎯 High Cardinality: Many distinct values
⚖️ Even Distribution: Uniform data and request distribution
🔍 Query Alignment: Frequently used in WHERE clauses
📈 Future Growth: Accommodates scaling requirements
🚫 Immutability: Value rarely changes

🎯 Core Partitioning Strategies

1. Entity ID-Based Partitioning

Using entity identifiers (typically GUIDs) as partition keys.

public class Product
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } = Guid.NewGuid().ToString(); // Different from ID
    
    public string Name { get; set; }
    public decimal Price { get; set; }
    public string Category { get; set; }
}

// Alternative: Use ID as partition key (creates hyperfragmentation)
public class HyperfragmentedProduct
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey => Id; // BAD: Creates tiny partitions
}

✅ When to Use

Write-heavy workloads with minimal cross-item queries
Point read scenarios where you always know the exact ID
Uniform access patterns across all entities

❌ When to Avoid

Range queries or filtering by other properties
Aggregation queries across multiple items
Reporting scenarios requiring cross-partition analysis

2. Time-Based Partitioning

Using temporal dimensions for partition keys.

public class EventLog
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // e.g., "2025-10", "2025-Q4", "2025-W42"
    
    public DateTime Timestamp { get; set; }
    public string EventType { get; set; }
    public string Source { get; set; }
    public object Data { get; set; }
}

public static class TimePartitionHelpers
{
    public static string GetMonthlyPartition(DateTime date)
        => date.ToString("yyyy-MM");
    
    public static string GetQuarterlyPartition(DateTime date)
    {
        int quarter = (date.Month - 1) / 3 + 1;
        return $"{date.Year}-Q{quarter}";
    }
    
    public static string GetWeeklyPartition(DateTime date)
    {
        var culture = CultureInfo.CurrentCulture;
        int weekOfYear = culture.Calendar.GetWeekOfYear(date, 
            CalendarWeekRule.FirstDay, DayOfWeek.Monday);
        return $"{date.Year}-W{weekOfYear:D2}";
    }
    
    public static string GetDailyPartition(DateTime date)
        => date.ToString("yyyy-MM-dd");
}

✅ When to Use

Time-series data with chronological access patterns
Log aggregation and analytics systems
Recent data prioritization scenarios
Natural archival requirements

❌ When to Avoid

Uniform temporal access across all historical data
Heavy write workloads concentrated in current time period

3. Category/Type-Based Partitioning

Using business categories or entity types as partition keys.

public class InventoryItem
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // e.g., "electronics", "clothing", "books"
    
    public string Name { get; set; }
    public string Category { get; set; }
    public decimal Price { get; set; }
    public int StockQuantity { get; set; }
}

// Multi-tenant example
public class TenantDocument
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // TenantId: "tenant-123", "tenant-456"
    
    public string TenantId { get; set; }
    public string DocumentType { get; set; }
    public object Content { get; set; }
}

✅ When to Use

Multi-tenant applications with tenant isolation
Category-based queries and analytics
Business domain segmentation
Access control requirements

❌ When to Avoid

Highly skewed category distributions
Frequent cross-category queries
Categories with unpredictable growth

4. Hybrid Approaches

Combining multiple dimensions for optimal distribution.

public class OrderItem
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // "region_YYYY-MM" or "customerId_status"
    
    public string CustomerId { get; set; }
    public string Region { get; set; }
    public string Status { get; set; }
    public DateTime OrderDate { get; set; }
    public decimal Amount { get; set; }
}

public static class HybridPartitionStrategies
{
    // Geography + Time
    public static string GetGeoTimePartition(string region, DateTime date)
        => $"{region}_{date:yyyy-MM}";
    
    // Customer + Status
    public static string GetCustomerStatusPartition(string customerId, string status)
        => $"{customerId}_{status}";
    
    // Type + Time + Hash
    public static string GetDistributedPartition(string type, DateTime date, string id)
    {
        int hash = Math.Abs(id.GetHashCode()) % 10;
        return $"{type}_{date:yyyy-MM}_{hash:D2}";
    }
}

✅ When to Use

Complex query patterns requiring multiple access paths
Large datasets needing better distribution
Mixed workload scenarios

❌ When to Avoid

Simple, uniform access patterns
Small datasets that don’t require complex partitioning

5. Synthetic Key Strategies

Creating artificial partition keys for better distribution.

public class HighVolumeEvent
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Synthetic key for distribution
    
    public DateTime Timestamp { get; set; }
    public string EventType { get; set; }
    public string Source { get; set; }
    public object Payload { get; set; }
}

public static class SyntheticKeyStrategies
{
    // Hash-based distribution
    public static string GetHashedPartition(string sourceId, int buckets = 100)
    {
        int hash = Math.Abs(sourceId.GetHashCode()) % buckets;
        return $"bucket_{hash:D3}";
    }
    
    // Round-robin distribution
    private static int _roundRobinCounter = 0;
    public static string GetRoundRobinPartition(int buckets = 50)
    {
        int bucket = Interlocked.Increment(ref _roundRobinCounter) % buckets;
        return $"rr_{bucket:D2}";
    }
    
    // Time + Hash hybrid
    public static string GetTimeHashPartition(DateTime timestamp, string id, int hashBuckets = 10)
    {
        int hash = Math.Abs(id.GetHashCode()) % hashBuckets;
        return $"{timestamp:yyyy-MM}_{hash:D2}";
    }
}

✅ When to Use

Extremely high-volume scenarios
Hot partition problems
Uniform distribution requirements

❌ When to Avoid

Query patterns requiring specific partition targeting
Small to medium datasets

📊 Strategy Comparison Matrix

Strategy	Cardinality	Distribution	Query Efficiency	Complexity	Best For
Entity ID (GUID)	🟢 Very High	🟢 Perfect	🔴 Poor	🟢 Simple	Point reads only
Time-Based	🟡 Medium	🟡 Variable	🟢 Good	🟡 Medium	Time-series data
Category-Based	🔴 Low	🔴 Skewed	🟢 Excellent	🟢 Simple	Multi-tenant apps
Hybrid	🟢 High	🟢 Good	🟢 Good	🔴 Complex	Complex scenarios
Synthetic	🟢 Very High	🟢 Perfect	🔴 Poor	🔴 Very Complex	High-volume uniform

🎮 Scenario-Based Recommendations

High-Volume Applications

Scenario: Social media platform with millions of posts per day.

public class SocialPost
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: userId_YYYY-MM
    
    public string UserId { get; set; }
    public string Content { get; set; }
    public DateTime CreatedAt { get; set; }
    public List<string> Tags { get; set; }
    public int LikesCount { get; set; }
}

// Partition strategy for user timeline queries
public static string GetUserTimelinePartition(string userId, DateTime date)
    => $"{userId}_{date:yyyy-MM}";

Recommended Strategy: User + Time Hybrid

Benefits: Efficient user timeline queries, temporal distribution
Trade-offs: Cross-user queries require multiple partitions

Multi-Tenant Systems

Scenario: SaaS application serving multiple organizations.

public class TenantData
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: tenantId
    
    public string TenantId { get; set; }
    public string DataType { get; set; }
    public object Content { get; set; }
    public DateTime CreatedAt { get; set; }
}

Recommended Strategy: Tenant ID-Based

Benefits: Perfect tenant isolation, efficient tenant queries
Trade-offs: May need synthetic keys for large tenants

Time-Series Data

Scenario: IoT sensor data collection and analysis.

public class SensorReading
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: deviceId_YYYY-MM-DD
    
    public string DeviceId { get; set; }
    public string SensorType { get; set; }
    public double Value { get; set; }
    public DateTime Timestamp { get; set; }
    public string Location { get; set; }
}

public static string GetDeviceTimePartition(string deviceId, DateTime timestamp)
{
    // For high-frequency devices, use daily partitions
    // For low-frequency devices, use monthly partitions
    var readingsPerDay = GetEstimatedReadingsPerDay(deviceId);
    
    if (readingsPerDay > 1000)
        return $"{deviceId}_{timestamp:yyyy-MM-dd}";
    else
        return $"{deviceId}_{timestamp:yyyy-MM}";
}

Recommended Strategy: Device + Time Hybrid

Benefits: Device-specific queries, temporal analytics
Trade-offs: Complex cross-device aggregations

Document Management

Scenario: Enterprise document storage and retrieval system.

public class Document
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: department_docType
    
    public string Department { get; set; }
    public string DocumentType { get; set; }
    public string Title { get; set; }
    public string Author { get; set; }
    public DateTime CreatedDate { get; set; }
    public List<string> Tags { get; set; }
}

public static string GetDocumentPartition(string department, string docType)
    => $"{department}_{docType}";

Recommended Strategy: Department + Document Type

Benefits: Department-specific queries, document type analytics
Trade-offs: May need rebalancing if departments have different document volumes

E-commerce Platforms

Scenario: Online retail platform with product catalog and orders.

public class Product
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: category_brand
    
    public string Category { get; set; }
    public string Brand { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
    public int StockLevel { get; set; }
}

public class Order
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: customerId or region_YYYY-MM
    
    public string CustomerId { get; set; }
    public string Region { get; set; }
    public DateTime OrderDate { get; set; }
    public List<OrderItem> Items { get; set; }
    public decimal TotalAmount { get; set; }
}

Recommended Strategy:

Products: Category + Brand
Orders: Customer ID or Region + Time

IoT Applications

Scenario: Smart city infrastructure monitoring.

public class InfrastructureEvent
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Strategy: zone_deviceType_YYYY-MM
    
    public string Zone { get; set; }
    public string DeviceType { get; set; }
    public string DeviceId { get; set; }
    public DateTime Timestamp { get; set; }
    public string EventType { get; set; }
    public object Payload { get; set; }
}

public static string GetInfrastructurePartition(string zone, string deviceType, DateTime timestamp)
    => $"{zone}_{deviceType}_{timestamp:yyyy-MM}";

Recommended Strategy: Zone + Device Type + Time

Benefits: Geographic and temporal analytics, device type insights
Trade-offs: Complex cross-zone queries

⚠️ Anti-Patterns and Pitfalls

1. Hyperfragmentation (GUID Partition Keys)

// ❌ BAD: Creates tiny partitions
public class BadDocument
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey => Id; // Creates one partition per document
}

// ✅ GOOD: Logical grouping
public class GoodDocument
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Based on business logic
    
    public string Category { get; set; }
    public DateTime CreatedDate { get; set; }
}

2. Hot Partitions

// ❌ BAD: All current data goes to one partition
public static string GetHotPartition()
    => "current"; // All new data goes here

// ✅ GOOD: Distribute current load
public static string GetDistributedPartition(string id)
{
    int hash = Math.Abs(id.GetHashCode()) % 50;
    return $"current_{hash:D2}";
}

3. Low Cardinality

// ❌ BAD: Only a few possible values
public class LowCardinalityDoc
{
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // "true" or "false" only
    
    public bool IsActive { get; set; }
}

// ✅ GOOD: Higher cardinality
public class HighCardinalityDoc
{
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Many possible user IDs
    
    public string UserId { get; set; }
    public bool IsActive { get; set; }
}

4. Frequently Changing Partition Keys

// ❌ BAD: Status changes frequently
public class BadOrder
{
    [JsonProperty("partitionKey")]
    public string PartitionKey => Status; // Changes during order lifecycle
    
    public string Status { get; set; } // "pending" → "shipped" → "delivered"
}

// ✅ GOOD: Stable partition key
public class GoodOrder
{
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Customer ID - doesn't change
    
    public string CustomerId { get; set; }
    public string Status { get; set; }
}

🔧 Implementation Guidelines

1. Partition Key Design Checklist

public class PartitionKeyValidator
{
    public static ValidationResult ValidatePartitionKey<T>(Expression<Func<T, string>> partitionKeyExpression)
    {
        var result = new ValidationResult();
        
        // Check 1: Cardinality estimation
        var estimatedCardinality = EstimateCardinality(partitionKeyExpression);
        if (estimatedCardinality < 100)
            result.Warnings.Add("Low cardinality detected - consider hybrid approach");
        
        // Check 2: Distribution analysis
        var distributionScore = AnalyzeDistribution(partitionKeyExpression);
        if (distributionScore < 0.7)
            result.Warnings.Add("Skewed distribution detected");
        
        // Check 3: Query alignment
        var queryAlignment = AnalyzeQueryPatterns(partitionKeyExpression);
        if (queryAlignment < 0.8)
            result.Warnings.Add("Partition key not aligned with common queries");
        
        return result;
    }
}

2. Dynamic Partition Strategy Selection

public class PartitionStrategySelector
{
    public static string SelectPartitionKey(DataCharacteristics characteristics)
    {
        return characteristics switch
        {
            { Volume: > 1_000_000, TemporalAccess: true } => 
                TimePartitionHelpers.GetMonthlyPartition(DateTime.UtcNow),
                
            { MultiTenant: true, TenantCount: < 1000 } => 
                characteristics.TenantId,
                
            { HighWriteVolume: true, UniformAccess: true } => 
                SyntheticKeyStrategies.GetHashedPartition(characteristics.EntityId),
                
            _ => GetDefaultPartition(characteristics)
        };
    }
}

3. Container Setup Best Practices

public static async Task<Container> CreateOptimizedContainer(
    Database database, 
    string containerId, 
    string partitionKeyPath,
    ContainerConfiguration config)
{
    var containerProperties = new ContainerProperties
    {
        Id = containerId,
        PartitionKeyPath = partitionKeyPath,
        
        // Optimized indexing policy
        IndexingPolicy = new IndexingPolicy
        {
            Automatic = true,
            IndexingMode = IndexingMode.Consistent,
            IncludedPaths = { new IncludedPath { Path = "/*" } },
            ExcludedPaths = config.ExcludedPaths.Select(p => new ExcludedPath { Path = p }).ToList(),
            CompositeIndexes = config.CompositeIndexes
        }
    };
    
    // Choose throughput model based on workload characteristics
    ThroughputProperties throughput = config.WorkloadType switch
    {
        WorkloadType.Steady => ThroughputProperties.CreateManualThroughput(config.BaseRUs),
        WorkloadType.Variable => ThroughputProperties.CreateAutoscaleThroughput(config.MaxRUs),
        WorkloadType.Bursts => ThroughputProperties.CreateAutoscaleThroughput(config.MaxRUs),
        _ => ThroughputProperties.CreateManualThroughput(400)
    };
    
    return await database.CreateContainerIfNotExistsAsync(containerProperties, throughput);
}

📈 Performance Optimization

1. Query Optimization Patterns

public class OptimizedQueries
{
    // ✅ GOOD: Single partition query
    public async Task<List<Order>> GetCustomerOrders(string customerId)
    {
        var query = new QueryDefinition(
            "SELECT * FROM c WHERE c.customerId = @customerId")
            .WithParameter("@customerId", customerId);
        
        return await ExecuteQuery(query, new PartitionKey(customerId));
    }
    
    // ⚠️ ACCEPTABLE: Cross-partition with filters
    public async Task<List<Order>> GetRecentOrdersByRegion(string region, DateTime since)
    {
        var partitions = GetTimePartitionsForDateRange(since, DateTime.UtcNow);
        var allResults = new List<Order>();
        
        foreach (var partition in partitions)
        {
            var query = new QueryDefinition(@"
                SELECT * FROM c 
                WHERE c.region = @region 
                AND c.orderDate >= @since")
                .WithParameter("@region", region)
                .WithParameter("@since", since);
            
            var results = await ExecuteQuery(query, new PartitionKey(partition));
            allResults.AddRange(results);
        }
        
        return allResults;
    }
    
    // ❌ AVOID: Full cross-partition scan
    public async Task<List<Order>> GetAllOrdersWithStatus(string status)
    {
        // This query hits ALL partitions - very expensive
        var query = new QueryDefinition(
            "SELECT * FROM c WHERE c.status = @status")
            .WithParameter("@status", status);
        
        return await ExecuteQuery(query); // No partition key = cross-partition
    }
}

2. Bulk Operations Optimization

public class BulkOperationOptimizer
{
    public async Task<BulkOperationResult> BulkInsertWithPartitionAwareness<T>(
        Container container, 
        IEnumerable<T> items,
        Func<T, string> partitionKeySelector)
    {
        // Group items by partition for optimal bulk operations
        var partitionGroups = items.GroupBy(item => partitionKeySelector(item));
        var results = new List<Task<ItemResponse<T>>>();
        
        foreach (var group in partitionGroups)
        {
            var partitionKey = new PartitionKey(group.Key);
            
            // Process items in the same partition together
            var tasks = group.Select(item => 
                container.CreateItemAsync(item, partitionKey));
            
            results.AddRange(tasks);
        }
        
        var responses = await Task.WhenAll(results);
        
        return new BulkOperationResult
        {
            SuccessCount = responses.Count(r => r.StatusCode == HttpStatusCode.Created),
            TotalRUs = responses.Sum(r => r.RequestCharge),
            FailedItems = responses.Where(r => r.StatusCode != HttpStatusCode.Created).ToList()
        };
    }
}

🔍 Monitoring and Diagnostics

1. Partition Metrics Monitoring

public class PartitionMonitor
{
    public async Task<PartitionMetrics> AnalyzePartitionHealth(Container container)
    {
        var metrics = new PartitionMetrics();
        
        // Monitor hot partitions
        var hotPartitions = await IdentifyHotPartitions(container);
        metrics.HotPartitions = hotPartitions;
        
        // Monitor partition size distribution
        var sizeDistribution = await GetPartitionSizeDistribution(container);
        metrics.SizeDistribution = sizeDistribution;
        
        // Monitor cross-partition query frequency
        var crossPartitionQueryRate = await GetCrossPartitionQueryRate(container);
        metrics.CrossPartitionQueryRate = crossPartitionQueryRate;
        
        return metrics;
    }
    
    private async Task<List<string>> IdentifyHotPartitions(Container container)
    {
        // Implementation would use Azure Monitor or custom telemetry
        // to identify partitions with high RU consumption
        return new List<string>();
    }
}

public class PartitionMetrics
{
    public List<string> HotPartitions { get; set; } = new();
    public Dictionary<string, long> SizeDistribution { get; set; } = new();
    public double CrossPartitionQueryRate { get; set; }
    public double AveragePartitionSize => SizeDistribution.Values.Average();
    public string LargestPartition => SizeDistribution.OrderByDescending(kvp => kvp.Value).First().Key;
}

2. Performance Alerting

public class PartitioningAlerts
{
    public static void SetupAlerts()
    {
        // Alert on hot partitions (>80% of total RUs)
        // Alert on large partitions (>15GB)
        // Alert on high cross-partition query ratio (>50%)
        // Alert on partition key skew (Gini coefficient >0.7)
    }
}

🚀 Advanced Partitioning Techniques

For complex, high-volume applications, traditional single-collection approaches may not provide optimal performance. This section covers advanced architectural patterns that combine partitioning strategies with sophisticated data lifecycle management.

1. Hot/Warm Architecture with TTL ✅ HIGHLY RECOMMENDED

This approach separates current/active data from historical data using two collections with different optimization strategies and automatic data lifecycle management.

Architecture Overview

// HOT Collection: Current data (last 30-45 days)
// - Optimized for writes and recent queries
// - TTL enabled for automatic cleanup
// - Higher RU allocation

// WARM Collection: Historical data (older than 30-45 days)  
// - Optimized for analytical queries
// - No TTL - permanent storage
// - Lower RU allocation

public class HotWarmFeedArchitecture
{
    private readonly Container _currentContainer;    // Hot: Recent data
    private readonly Container _historicalContainer; // Warm: Historical data
    private readonly TimeSpan _migrationThreshold = TimeSpan.FromDays(30);
    
    public HotWarmFeedArchitecture(
        Container currentContainer, 
        Container historicalContainer)
    {
        _currentContainer = currentContainer;
        _historicalContainer = historicalContainer;
    }
}

Container Configuration

public static async Task<(Container current, Container historical)> 
    SetupHotWarmContainers(Database database)
{
    // HOT Container: Optimized for real-time operations
    var currentContainerProperties = new ContainerProperties
    {
        Id = "feeds_current",
        PartitionKeyPath = "/partitionKey",
        
        // TTL safety net - automatic cleanup after 45 days
        DefaultTimeToLive = (int)TimeSpan.FromDays(45).TotalSeconds,
        
        IndexingPolicy = new IndexingPolicy
        {
            // Aggressive indexing for real-time queries
            Automatic = true,
            IndexingMode = IndexingMode.Consistent,
            CompositeIndexes = 
            {
                new Collection<CompositePath>
                {
                    new CompositePath { Path = "/feedProviderId", Order = CompositePathSortOrder.Ascending },
                    new CompositePath { Path = "/publishedDate", Order = CompositePathSortOrder.Descending }
                }
            }
        }
    };
    
    // Higher throughput for current data
    var currentContainer = await database.CreateContainerIfNotExistsAsync(
        currentContainerProperties, 
        ThroughputProperties.CreateAutoscaleThroughput(4000));
    
    // WARM Container: Optimized for analytics
    var historicalContainerProperties = new ContainerProperties
    {
        Id = "feeds_historical",
        PartitionKeyPath = "/partitionKey",
        
        // No TTL - permanent storage
        DefaultTimeToLive = null,
        
        IndexingPolicy = new IndexingPolicy
        {
            // Selective indexing for analytical queries
            Automatic = true,
            IndexingMode = IndexingMode.Consistent,
            ExcludedPaths = 
            {
                new ExcludedPath { Path = "/content/*" } // Exclude large content
            }
        }
    };
    
    // Lower throughput for historical data
    var historicalContainer = await database.CreateContainerIfNotExistsAsync(
        historicalContainerProperties, 
        ThroughputProperties.CreateManualThroughput(800));
        
    return (currentContainer, historicalContainer);
}

Smart Query Routing

public class SmartQueryRouter
{
    public async Task<IEnumerable<FeedItem>> GetFeedsAsync(
        string providerId = null,
        DateTime? fromDate = null,
        DateTime? toDate = null)
    {
        var from = fromDate ?? DateTime.UtcNow.AddDays(-7);
        var to = toDate ?? DateTime.UtcNow;
        var hotThreshold = DateTime.UtcNow.Subtract(_migrationThreshold);
        
        var results = new List<FeedItem>();
        
        // Route queries to appropriate container(s)
        if (to > hotThreshold)
        {
            // Query hot container for recent data
            var hotResults = await QueryContainer(_currentContainer, providerId, 
                from > hotThreshold ? from : hotThreshold, to);
            results.AddRange(hotResults);
        }
        
        if (from < hotThreshold)
        {
            // Query warm container for historical data
            var warmResults = await QueryContainer(_historicalContainer, providerId, 
                from, to < hotThreshold ? to : hotThreshold);
            results.AddRange(warmResults);
        }
        
        return results.OrderByDescending(f => f.PublishedDate);
    }
}

Benefits

✅ Optimal Performance: 80% of queries hit small, fast hot container
✅ Automatic Cleanup: TTL ensures hot container stays lean
✅ Cost Efficient: Different RU allocation per container
✅ Failsafe: TTL prevents data accumulation if migration fails
✅ Independent Scaling: Optimize each container separately

2. Near Real-Time Data Migration ✅ RECOMMENDED

Continuous migration of data from hot to warm storage eliminates the need for bulk operations and maintains consistent performance.

Background Migration Service

public class ContinuousMigrationService : BackgroundService
{
    private readonly Container _hotContainer;
    private readonly Container _warmContainer;
    private readonly TimeSpan _migrationAge = TimeSpan.FromDays(30);
    private readonly TimeSpan _migrationInterval = TimeSpan.FromMinutes(15);
    
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                await MigrateEligibleData();
                await Task.Delay(_migrationInterval, stoppingToken);
            }
            catch (Exception ex)
            {
                // Log error and continue
                await Task.Delay(TimeSpan.FromMinutes(5), stoppingToken);
            }
        }
    }
    
    private async Task MigrateEligibleData()
    {
        var cutoffDate = DateTime.UtcNow.Subtract(_migrationAge);
        var eligiblePartitions = GetPartitionsOlderThan(cutoffDate);
        
        foreach (var partition in eligiblePartitions)
        {
            await MigratePartition(partition, cutoffDate);
        }
    }
    
    private async Task MigratePartition(string partition, DateTime cutoffDate)
    {
        // Query eligible items for migration
        var query = new QueryDefinition(@"
            SELECT * FROM c 
            WHERE c.publishedDate < @cutoffDate
            AND (NOT IS_DEFINED(c.migrated) OR c.migrated = false)")
            .WithParameter("@cutoffDate", cutoffDate);
        
        var iterator = _hotContainer.GetItemQueryIterator<FeedItem>(
            query,
            requestOptions: new QueryRequestOptions
            {
                PartitionKey = new PartitionKey(partition),
                MaxItemCount = 100
            });
        
        while (iterator.HasMoreResults)
        {
            var response = await iterator.ReadNextAsync();
            var migrationTasks = response.Select(MigrateSingleItem);
            await Task.WhenAll(migrationTasks);
        }
    }
    
    private async Task MigrateSingleItem(FeedItem item)
    {
        try
        {
            // 1. Copy to warm container (remove TTL)
            var warmItem = CloneForWarmStorage(item);
            await _warmContainer.CreateItemAsync(warmItem, new PartitionKey(warmItem.PartitionKey));
            
            // 2. Mark as migrated in hot container with short TTL
            await _hotContainer.PatchItemAsync<FeedItem>(
                item.Id,
                new PartitionKey(item.PartitionKey),
                new[]
                {
                    PatchOperation.Add("/migrated", true),
                    PatchOperation.Replace("/ttl", (int)TimeSpan.FromDays(1).TotalSeconds)
                });
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.Conflict)
        {
            // Item already migrated - mark in hot container
            await MarkAsMigrated(item);
        }
    }
}

Benefits

✅ Continuous Operation: No bulk migration windows
✅ Balanced Load: Spreads migration work over time
✅ Self-Healing: Recovers from migration failures
✅ Configurable: Easy to adjust thresholds and intervals

3. Single Collection with Time-Based Partitioning ⚠️ ACCEPTABLE

A traditional approach using a single collection with intelligent partition key design.

Implementation

public class SingleCollectionApproach
{
    private readonly Container _feedsContainer;
    
    public async Task<FeedItem> StoreFeedAsync(FeedItem feed)
    {
        // Use hybrid partitioning: provider_month
        feed.PartitionKey = $"{feed.FeedProviderId}_{feed.PublishedDate:yyyy-MM}";
        
        return await _feedsContainer.UpsertItemAsync(
            feed, 
            new PartitionKey(feed.PartitionKey));
    }
    
    public async Task<IEnumerable<FeedItem>> GetFeedsAcrossTime(
        string providerId,
        DateTime fromDate,
        DateTime toDate)
    {
        // Single query can span multiple time periods
        var partitions = GetPartitionsForDateRange(providerId, fromDate, toDate);
        var allResults = new List<FeedItem>();
        
        foreach (var partition in partitions)
        {
            var query = new QueryDefinition(@"
                SELECT * FROM c 
                WHERE c.feedProviderId = @providerId 
                AND c.publishedDate >= @fromDate 
                AND c.publishedDate <= @toDate
                ORDER BY c.publishedDate DESC")
                .WithParameter("@providerId", providerId)
                .WithParameter("@fromDate", fromDate)
                .WithParameter("@toDate", toDate);
            
            var iterator = _feedsContainer.GetItemQueryIterator<FeedItem>(
                query,
                requestOptions: new QueryRequestOptions
                {
                    PartitionKey = new PartitionKey(partition)
                });
            
            while (iterator.HasMoreResults)
            {
                var response = await iterator.ReadNextAsync();
                allResults.AddRange(response);
            }
        }
        
        return allResults.OrderByDescending(f => f.PublishedDate);
    }
}

Benefits and Limitations

✅ Simplicity: Single container to manage
✅ Cross-time queries: Natural query spanning
⚠️ Growth: Container grows indefinitely
⚠️ Performance: May degrade as data volume increases
⚠️ Archival: Complex partition-level archival required

4. Multiple Collections by Year ❌ NOT RECOMMENDED

Creating separate collections for each year of data.

Why This Approach is Problematic

public class MultiYearCollectionApproach
{
    private readonly Database _database;
    private readonly Dictionary<int, Container> _yearContainers = new();
    
    public async Task<IEnumerable<FeedItem>> GetFeedsAcrossYears(
        string providerId,
        DateTime fromDate,
        DateTime toDate)
    {
        // COMPLEX: Must query multiple containers
        var yearsToQuery = GetYearsInRange(fromDate, toDate);
        var allResults = new List<FeedItem>();
        
        // Sequential queries - PERFORMANCE IMPACT
        foreach (var year in yearsToQuery)
        {
            var container = await GetContainerForYear(year);
            var yearResults = await QueryYearContainer(container, providerId, fromDate, toDate);
            allResults.AddRange(yearResults);
        }
        
        return allResults.OrderByDescending(f => f.PublishedDate);
    }
    
    private async Task<Container> GetContainerForYear(int year)
    {
        if (!_yearContainers.ContainsKey(year))
        {
            // OVERHEAD: Must create/manage multiple containers
            var containerProperties = new ContainerProperties
            {
                Id = $"feeds_{year}",
                PartitionKeyPath = "/partitionKey"
            };
            
            _yearContainers[year] = await _database.CreateContainerIfNotExistsAsync(
                containerProperties, 
                ThroughputProperties.CreateManualThroughput(400)); // COST: Minimum RU per container
        }
        
        return _yearContainers[year];
    }
}

Problems with This Approach

❌ Complex Queries: Cross-year queries require application-level joins
❌ Higher Costs: Each container needs minimum RU allocation
❌ Operational Overhead: Multiple containers to monitor and manage
❌ Schema Evolution: Different containers may have different schemas over time
❌ Performance: Sequential queries instead of parallel partitioning

📊 Advanced Techniques Comparison

Approach	Complexity	Performance	Cost	Maintenance	Scalability
Hot/Warm + TTL	🟡 Medium	🟢 Excellent	🟢 Optimal	🟢 Low	🟢 Excellent
Near Real-Time Migration	🟡 Medium	🟢 Very Good	🟢 Good	🟡 Medium	🟢 Excellent
Single Collection	🟢 Low	🟡 Good	🟡 Good	🟢 Low	🟡 Limited
Multiple Collections	🔴 High	🔴 Poor	🔴 Expensive	🔴 High	🔴 Poor

🎯 Recommendation Hierarchy

Tier 1: Production-Ready Solutions

Hot/Warm Architecture + TTL + Near Real-Time Migration ⭐ BEST
- Optimal for high-volume, time-sensitive data
- Self-managing and cost-efficient
- Provides best performance for typical query patterns

Tier 2: Acceptable for Smaller Scale

Single Collection with Hybrid Partitioning
- Good for moderate volumes (< 1TB total)
- Simpler to implement and maintain
- Consider migration to Tier 1 as scale increases

Tier 3: Avoid in Production

Multiple Collections by Time Period
- Only consider for very specific edge cases
- High operational overhead and complexity
- Better alternatives available in Tiers 1-2

🛠️ Implementation Decision Framework

Use this framework to choose the right approach:

Data Volume > 500GB AND High Query Load?
├─ YES: Use Hot/Warm Architecture + TTL
└─ NO: Continue...

Need Cross-Time Analytics AND Real-Time Performance?
├─ YES: Use Hot/Warm with Near Real-Time Migration  
└─ NO: Continue...

Simple Requirements AND Small Scale (< 100GB)?
├─ YES: Use Single Collection
└─ NO: Reconsider Hot/Warm Architecture

Multiple Time Periods with Independent Management?
├─ YES: Carefully consider Multiple Collections (usually NOT recommended)
└─ NO: Use Single Collection or Hot/Warm

The Hot/Warm Architecture with TTL and Near Real-Time Migration provides the best balance of performance, cost, and operational simplicity for most production scenarios involving time-series or feed data.

Scenario Overview

Database: diginsight-cdb-testlive-01
Collection: feeds
Requirements:

Multiple feed providers with varying volumes
Time-based access patterns (recent feeds prioritized)
Mixed query patterns (crawlers, indexers, user queries)
Configurable archival (older feeds moved to archive storage)
Performance optimization for recent data

Partition Strategy Analysis

Option 1: Time-Based Partitioning (Monthly) - RECOMMENDED

public class FeedItem
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; } // Format: "YYYY-MM"
    
    public string FeedProviderId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public DateTime PublishedDate { get; set; }
    public DateTime CrawledDate { get; set; } = DateTime.UtcNow;
    public string SourceUrl { get; set; }
    public string[] Tags { get; set; }
    public FeedMetadata Metadata { get; set; }
}

public static class FeedPartitionStrategy
{
    public static string GetPartitionKey(DateTime publishedDate)
        => publishedDate.ToString("yyyy-MM"); // e.g., "2025-10"
    
    public static IEnumerable<string> GetPartitionsForDateRange(DateTime from, DateTime to)
    {
        var current = new DateTime(from.Year, from.Month, 1);
        var end = new DateTime(to.Year, to.Month, 1);
        
        while (current <= end)
        {
            yield return current.ToString("yyyy-MM");
            current = current.AddMonths(1);
        }
    }
}

Benefits:

✅ Predictable Growth: New partitions created monthly
✅ Recent Data Optimization: Most queries target 1-2 recent partitions
✅ Easy Archival: Archive entire partitions older than X years
✅ Balanced Distribution: Even distribution over time
✅ Query Efficiency: Temporal queries are highly efficient

Trade-offs:

⚠️ Provider-Specific Queries: Require cross-partition queries
⚠️ Hot Partition: Current month receives all new writes

Option 2: Hybrid Provider-Time Partitioning

public static class HybridFeedPartitioning
{
    private static readonly HashSet<string> HighVolumeProviders = new()
    {
        "reuters", "ap", "bbc", "cnn", "bloomberg"
    };
    
    public static string GetPartitionKey(string providerId, DateTime publishedDate)
    {
        if (HighVolumeProviders.Contains(providerId.ToLower()))
        {
            // High-volume providers get monthly partitions
            return $"{providerId}_{publishedDate:yyyy-MM}";
        }
        else
        {
            // Low-volume providers get quarterly partitions
            int quarter = (publishedDate.Month - 1) / 3 + 1;
            return $"{providerId}_{publishedDate.Year}-Q{quarter}";
        }
    }
}

Benefits:

✅ Provider Isolation: Efficient provider-specific queries
✅ Adaptive Granularity: Different time granularity based on volume
✅ Reduced Cross-Partition Queries: Provider queries hit single partition

Trade-offs:

⚠️ Complex Management: More complex partition key logic
⚠️ Provider Imbalance: Popular providers may still create hot partitions

Option 3: GUID Partitioning (NOT RECOMMENDED)

// ❌ AVOID: Creates hyperfragmented partitions
public class HyperfragmentedFeed
{
    [JsonProperty("id")]
    public string Id { get; set; } = Guid.NewGuid().ToString();
    
    [JsonProperty("partitionKey")]
    public string PartitionKey => Id; // BAD: One partition per feed item
}

Problems:

❌ Poor Query Performance: All queries become cross-partition
❌ High RU Costs: Expensive aggregations and filters
❌ No Locality: Related feeds scattered across partitions
❌ Archival Complexity: Cannot easily identify old data

Implementation Recommendation

public interface IFeedStorageService
{
    Task<FeedItem> StoreFeedAsync(FeedItem feed);
    Task<IEnumerable<FeedItem>> GetRecentFeedsAsync(string providerId = null, int days = 7);
    Task<IEnumerable<FeedItem>> SearchFeedsAsync(string searchTerm, DateTime? from = null, DateTime? to = null);
    Task<IEnumerable<FeedItem>> GetFeedsByProviderAsync(string providerId, DateTime from, DateTime to);
    Task ArchiveOldFeedsAsync(int archiveAfterYears = 2);
}

public class FeedStorageService : IFeedStorageService
{
    private readonly Container _container;
    
    public FeedStorageService(Container container)
    {
        _container = container;
    }
    
    public async Task<FeedItem> StoreFeedAsync(FeedItem feed)
    {
        // Use time-based partitioning strategy
        feed.PartitionKey = FeedPartitionStrategy.GetPartitionKey(feed.PublishedDate);
        
        return await _container.UpsertItemAsync(
            feed, 
            new PartitionKey(feed.PartitionKey));
    }
    
    public async Task<IEnumerable<FeedItem>> GetRecentFeedsAsync(string providerId = null, int days = 7)
    {
        var fromDate = DateTime.UtcNow.AddDays(-days);
        var partitions = FeedPartitionStrategy.GetPartitionsForDateRange(fromDate, DateTime.UtcNow);
        
        var allFeeds = new List<FeedItem>();
        
        foreach (var partition in partitions)
        {
            var queryText = providerId != null 
                ? "SELECT * FROM c WHERE c.feedProviderId = @providerId AND c.publishedDate >= @fromDate ORDER BY c.publishedDate DESC"
                : "SELECT * FROM c WHERE c.publishedDate >= @fromDate ORDER BY c.publishedDate DESC";
            
            var query = new QueryDefinition(queryText)
                .WithParameter("@fromDate", fromDate);
            
            if (providerId != null)
                query.WithParameter("@providerId", providerId);
            
            var iterator = _container.GetItemQueryIterator<FeedItem>(
                query,
                requestOptions: new QueryRequestOptions
                {
                    PartitionKey = new PartitionKey(partition),
                    MaxItemCount = 100
                });
            
            while (iterator.HasMoreResults)
            {
                var response = await iterator.ReadNextAsync();
                allFeeds.AddRange(response);
            }
        }
        
        return allFeeds.OrderByDescending(f => f.PublishedDate);
    }
    
    public async Task ArchiveOldFeedsAsync(int archiveAfterYears = 2)
    {
        var cutoffDate = DateTime.UtcNow.AddYears(-archiveAfterYears);
        var archivePartitions = FeedPartitionStrategy.GetPartitionsForDateRange(
            new DateTime(2020, 1, 1), 
            cutoffDate);
        
        foreach (var partition in archivePartitions)
        {
            // Move entire partition to archive storage
            await ArchivePartition(partition, cutoffDate);
        }
    }
    
    private async Task ArchivePartition(string partition, DateTime cutoffDate)
    {
        // Implementation would:
        // 1. Query all items in the partition
        // 2. Copy to archive container/storage account
        // 3. Delete from main container
        // 4. Update metadata about archived partitions
    }
}

Container Configuration

public static async Task<Container> SetupFeedsContainer(Database database)
{
    var containerProperties = new ContainerProperties
    {
        Id = "feeds",
        PartitionKeyPath = "/partitionKey",
        
        IndexingPolicy = new IndexingPolicy
        {
            Automatic = true,
            IndexingMode = IndexingMode.Consistent,
            IncludedPaths = { new IncludedPath { Path = "/*" } },
            ExcludedPaths = 
            {
                new ExcludedPath { Path = "/content/*" }, // Exclude large content
                new ExcludedPath { Path = "/metadata/rawData/*" }
            },
            CompositeIndexes = 
            {
                // Optimize for provider + time queries
                new Collection<CompositePath>
                {
                    new CompositePath { Path = "/feedProviderId", Order = CompositePathSortOrder.Ascending },
                    new CompositePath { Path = "/publishedDate", Order = CompositePathSortOrder.Descending }
                },
                // Optimize for time-based queries
                new Collection<CompositePath>
                {
                    new CompositePath { Path = "/publishedDate", Order = CompositePathSortOrder.Descending },
                    new CompositePath { Path = "/crawledDate", Order = CompositePathSortOrder.Descending }
                }
            }
        }
    };
    
    // Use autoscale for variable feed ingestion loads
    return await database.CreateContainerIfNotExistsAsync(
        containerProperties, 
        ThroughputProperties.CreateAutoscaleThroughput(4000)); // Max 4000 RU/s
}

Performance Characteristics

Query Pattern	Partitions Hit	RU Estimate	Performance
Recent feeds (7 days)	1-2	10-50 RUs	✅ Excellent
Provider feeds (30 days)	1-2	20-100 RUs	✅ Good
Search across 3 months	3	50-200 RUs	✅ Good
All providers (recent)	1-2	50-300 RUs	✅ Good
Cross-provider analytics	Multiple	200+ RUs	⚠️ Moderate

Summary

For the feed database scenario, monthly time-based partitioning is the optimal strategy because:

🎯 Query Alignment: Most queries target recent data (last few months)
📈 Scalable Growth: New partitions created predictably over time
🗄️ Simple Archival: Archive entire old partitions
⚖️ Balanced Load: Even distribution of data over time
💰 Cost Effective: Efficient RU consumption for common queries

This approach provides the best balance of performance, maintainability, and cost-effectiveness for the feed aggregation use case.